Abstract

The goal of this work is to study how increased variability in the degree
distribution impacts the global connectivity properties of a large
network. We approach this question by modeling the network as a
uniform random graph with a given degree sequence. We analyze the
effect of the degree variability on the approximate size of the largest
connected component using stochastic ordering techniques. A counterexample
shows that a higher degree variability may lead to a larger
connected component, contrary to basic intuition about branching processes.
When certain extremal cases are ruled out, the higher degree
variability is shown to decrease the limiting approximate size of the
largest connected component.


1 Introduction 

Digital communication networks and online social media have dramatically
increased the spread of information in our society. As a result, the global
connectivity structure of communication between people appears to be better
modeled as a dimension-free unstructured graph instead of a geometrical
graph based on a two-dimensional grid, and the spread of messages
over an online network can be modeled as an epidemic on a large random
graph. When the nodes of the network spread the epidemic independently
of each other, the final outcome of the epidemic, or the ultimate set of nodes
that receive a message, corresponds to the connected component of the initial
root node in a randomly thinned version of the original communication
graph called the epidemic generated graph [BST14]. This is why the sizes of
connected components are important in studying information dynamics in
unstructured networks.

* A preliminary version of this work was presented at the 12th Workshop on Algorithms
and Models for the Web Graph (WAW '15), Eindhoven, Netherlands, December 2015.

^ Aalto University School of Science, Department of Mathematics and Systems Analysis, 
Otakaari 1 Espoo, Finland, math.aalto.fi/en/research/stochastics/ 
A characterizing statistical feature of many communication networks is
the high variability among node degrees, which is manifested by observed
approximate power-law shapes in empirical measurements. The simplest
mathematical model that captures the degree variability is the so-called
configuration model, which is defined as follows. Fix a set of nodes
labeled using $[n] = \{1, 2, \dots, n\}$ and a sequence of nonnegative integers
$d_n = (d_n(1), \dots, d_n(n))$ such that $\ell_n = \sum_{i=1}^n d_n(i)$ is even.
Each node $i$ gets assigned $d_n(i)$ half-links, or stubs, and then we select a
uniform random matching among the set of all half-links. A matched pair of
half-links will form a link, and we denote by $X_{i,j}$ the number of links with
one half-link assigned to $i$ and the other half-link assigned to $j$. The
resulting random matrix $(X_{i,j})$ constitutes a random undirected multigraph
on the node set $[n]$. This multigraph is called the configuration model
generated by the degree sequence $d_n$. The multigraph is called simple if it
contains no loops ($X_{i,i} = 0$ for all $i$) and no parallel links ($X_{i,j} \le 1$
for all $i, j$). The distribution of the multigraph conditional on being simple
is the same as the distribution of the uniform random graph in the space of
graphs on $[n]$ with degree sequence $d_n$ [vdH14, Prop. 7.13].
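
The stub-matching construction described above can be sketched in a few lines
of Python. This is only an illustrative sketch, not code from the paper: the
function names `configuration_model` and `is_simple` are hypothetical, nodes are
labeled 0, ..., n-1 instead of 1, ..., n, and the uniform random matching is
obtained by shuffling the list of stubs and pairing consecutive entries.

```python
import random
from collections import Counter

def configuration_model(degrees, seed=None):
    """Return link counts {(i, j): X_ij} with i <= j for a uniform stub matching."""
    if sum(degrees) % 2 != 0:
        raise ValueError("the sum of the degrees must be even")
    rng = random.Random(seed)
    # One list entry per half-link (stub), labeled by the node that owns it.
    stubs = [node for node, d in enumerate(degrees) for _ in range(d)]
    # Pairing consecutive entries of a uniform shuffle gives a uniform matching.
    rng.shuffle(stubs)
    links = Counter()
    for a, b in zip(stubs[0::2], stubs[1::2]):
        links[(min(a, b), max(a, b))] += 1
    return links

def is_simple(links):
    """True if the multigraph has no loops and no parallel links."""
    return all(i != j and count == 1 for (i, j), count in links.items())

links = configuration_model([3, 2, 2, 1, 4, 2], seed=7)
print(dict(links), is_simple(links))
```

Sampling repeatedly until `is_simple` returns True would give a draw from the
uniform random graph with the given degree sequence, in line with the
conditioning statement above.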

A tractable mathematical way to analyze large random graphs is to let
the size of the graph grow to infinity and approximate the empirical degree
distribution of the random graph

\[ p_n(k) = \frac{1}{n} \sum_{i=1}^n 1(d_n(i) = k) \]

using a limiting probability distribution $p$ on the infinite set of nonnegative
integers $\mathbb{Z}_+$. One of the key results in the theory of random graphs is the
following, first derived by Molloy and Reed [MR95, MR98] and strengthened
by Janson and Luczak [JL09]. Assume that the collection of degree sequences
$(d_n)$ is such that the corresponding empirical degree distributions satisfy

\[ p_n(k) \to p(k) \ \text{for all } k \ge 0, \qquad \sup_n m_2(p_n) < \infty, \]

and that $p(2) < 1$ and $0 < m_1(p) < \infty$, where $m_i(p) = \sum_{k \ge 0} k^i p(k)$
denotes the $i$th moment of $p$. Then [JL09, Thm 2.3, Rem 2.7] the size of the largest
connected component $|C_{\max}|$ in the configuration model multigraph (and in
the associated uniform random graph) converges according to

\[ n^{-1} |C_{\max}| \to \zeta_{CM}(p) \quad \text{(in probability)}, \tag{1.2} \]

where the constant $\zeta_{CM}(p) \in [0,1]$ is uniquely characterized by $p$ and satisfies
$\zeta_{CM}(p) > 0$ if and only if $m_2(p) > 2 m_1(p)$. The above fundamental result
is important because it has been extended to models of wide generality (e.g.
[BJR11]).
Most earlier mathematical studies (and extensions) have focused on
establishing the phase transition (showing that there is a critical phenomenon
related to whether or not $\zeta_{CM}(p) > 0$), and studying the behavior of the
model near the critical regime. On the other hand, for practical applications
it may be crucial to be able to predict the size of $\zeta_{CM}(p)$ based on estimates
of the degree distribution $p$. This paper aims to obtain qualitative insight
into this question by studying properties of the functional $p \mapsto \zeta_{CM}(p)$ in
detail by analyzing its sensitivity to the variability of $p$. As a robust tool
for comparing levels of variability, we apply stochastic convex ordering
techniques. Our main results are Theorem 4.1, which confirms that a higher
degree variability decreases the limiting component size when certain special
cases are ruled out, and Theorem 4.6, which confirms the same for sufficiently
supercritical mixed Poisson distributions with heavy tails. We also provide
counterexamples which show that, rather counterintuitively, a higher degree
variability may lead to a larger connected component in some special cases.
Despite the vast literature on the asymptotics of configuration models (cf.
[vdH14] and references therein) and numerous works on the stochastic ordering
properties of branching processes (e.g. [Pel07, SK14, VYZ14]), this paper
appears to be the first of its kind to study the size biasing effects prominent in
most random graph models of interest using stochastic ordering techniques.

The rest of the paper is organized as follows. In Section 2 we introduce
notations and recall basic results of size-biased distributions relevant to
configuration models. Section 3 summarizes various stochastic ordering
notions related to branching processes. The key contributions of the paper
are in Section 4, which contains the main results and their proofs, together
with counterexamples and numerical experiments. Section 5 concludes the
paper.


2 Branching functional of the configuration model 

2.1 Size biasing and downshifting 

The configuration model, like many real-world networks, exhibits a size-bias
phenomenon in degrees, in that “your friends are likely to have more friends
than you do”. The size biasing of a probability distribution $\mu$ on the
nonnegative real line $\mathbb{R}_+$ (or a subset thereof) with mean
$m_1(\mu) = \int x \, \mu(dx) \in (0, \infty)$ is the probability distribution $\mu^*$
defined by

\[ \mu^*(B) = \frac{\int_B x \, \mu(dx)}{m_1(\mu)}, \qquad B \subset \mathbb{R}_+. \]

Furthermore, the downshifted size biasing of $\mu$, denoted $\mu^\circ$, is defined as the
distribution of $X^\circ = X^* - 1$ where $X^*$ is a random number distributed
according to $\mu^*$. Note that if $X$ and $X^*$ are random numbers with distributions
$\mu$ and $\mu^*$, respectively, then

\[ \mathbb{E}\,\phi(X^*) = \frac{\mathbb{E}\,X\phi(X)}{\mathbb{E}\,X} \tag{2.1} \]

for any real function $\phi$ such that the above expectations exist.

For probability distributions on $\mathbb{Z}_+ = \{0, 1, 2, \dots\}$ we use the same
symbol $p$ both for the distribution and its probability mass function. Then the
size biasing and the downshifted size biasing of $p$ can be represented as

\[ p^*(k) = \frac{k\,p(k)}{m_1(p)} \quad \text{and} \quad p^\circ(k) = p^*(k+1). \tag{2.2} \]

If $G_p(s) = \sum_{k \ge 0} s^k p(k)$ denotes the generating function of $p$, then the
generating functions of $p^*$ and $p^\circ$ are given by

\[ G_{p^*}(s) = \frac{s}{m_1(p)}\, G_p'(s) \quad \text{and} \quad G_{p^\circ}(s) = \frac{1}{m_1(p)}\, G_p'(s). \tag{2.3} \]
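
As a small illustration of (2.2), the following Python sketch computes the size
biasing and the downshifted size biasing of a probability mass function stored
as a list indexed by $k$. The helper names `size_bias` and `downshift_size_bias`
are hypothetical and chosen here for illustration only; exact rational
arithmetic is used so that the result can be compared entry by entry.

```python
from fractions import Fraction

def size_bias(p):
    """Return p* with p*(k) = k p(k) / m_1(p); p is a list indexed by k."""
    mean = sum(k * pk for k, pk in enumerate(p))
    if mean == 0:
        raise ValueError("size biasing requires a nonzero mean")
    return [k * pk / mean for k, pk in enumerate(p)]

def downshift_size_bias(p):
    """Return p° with p°(k) = p*(k + 1); the entry p*(0) = 0 is dropped."""
    return size_bias(p)[1:]

# Probability mass function of Bin(3, 1/2).
binomial_3 = [Fraction(1, 8), Fraction(3, 8), Fraction(3, 8), Fraction(1, 8)]
print(downshift_size_bias(binomial_3))
```

For the Bin(3, 1/2) input the result coincides with the probability mass
function of Bin(2, 1/2), which is the binomial identity
$\mathrm{Bin}(n, p)^\circ = \mathrm{Bin}(n-1, p)$.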

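Table 1 below summarizes the size biasings of some common probability
distributions that are used in the sequel. Here $\delta_x$ denotes the Dirac point
mass at $x$, $\mathrm{Bin}(n, p)$ and $\mathrm{Poi}(c)$ refer to the standard binomial and Poisson
distributions, and $\mathrm{MPoi}(\mu)$ denotes the $\mu$-mixed Poisson distribution on $\mathbb{Z}_+$
with probability mass function

\[ p(k) = \int_{\mathbb{R}_+} e^{-\lambda} \frac{\lambda^k}{k!} \, \mu(d\lambda), \qquad k \in \mathbb{Z}_+. \]

Moreover, $\mathrm{Par}(\alpha, c)$ is the Pareto distribution with shape $\alpha > 1$ and scale
$c > 0$ (density function $f(t) = \alpha c^\alpha t^{-\alpha-1}$, $t > c$), and $\mathrm{LNor}(b, \sigma^2)$ is the
lognormal distribution with location $b \in \mathbb{R}$ and scale $\sigma > 0$ (density function
$f(t) = \frac{1}{t\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(\log t - b)^2}{2\sigma^2}\right)$ for $t > 0$) [AGK15, Equation (45)].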


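Distribution              Size biasing                        Downshifted size biasing
------------------------  ----------------------------------  ------------------------------------
$\delta_x$                $\delta_x$                          $\delta_{x-1}$
$\mathrm{Bin}(n, p)$      $\mathrm{Bin}(n-1, p) + 1$          $\mathrm{Bin}(n-1, p)$
$\mathrm{Poi}(c)$         $\mathrm{Poi}(c) + 1$               $\mathrm{Poi}(c)$
$\mathrm{MPoi}(\mu)$      $\mathrm{MPoi}(\mu^*) + 1$          $\mathrm{MPoi}(\mu^*)$
$\mathrm{Par}(\alpha, c)$ $\mathrm{Par}(\alpha-1, c)$         $\mathrm{Par}(\alpha-1, c) - 1$
$\mathrm{LNor}(b, \sigma^2)$  $\mathrm{LNor}(b+\sigma^2, \sigma^2)$  $\mathrm{LNor}(b+\sigma^2, \sigma^2) - 1$

Table 1: Size biasings of some common probability distributions ($\mu \pm 1$ is
shorthand for the distribution of $X \pm 1$ with $X$ distributed according to $\mu$).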

In Section 4.3 we analyze in detail a class of Pareto-mixed Poisson
distributions $\mathrm{MPoi}(\mathrm{Par}(\alpha, c))$ parameterized by $\alpha > 1$ and $c > 0$. This class
serves as a convenient benchmark for degree distributions of random graphs
because it inherits the heavy-tailed behavior of the Pareto distribution, and
because its downshifted size biasings are given in a simple form by

\[ \mathrm{MPoi}(\mathrm{Par}(\alpha, c))^\circ = \mathrm{MPoi}(\mathrm{Par}(\alpha, c)^*) = \mathrm{MPoi}(\mathrm{Par}(\alpha-1, c)). \tag{2.4} \]


2.2 Definition of the branching functional 

Given a probability distribution $p$ on $\mathbb{Z}_+$, we denote by

\[ \eta(p) = \inf\{ s \ge 0 : G_p(s) = s \} \]

the smallest fixed point of the generating function $G_p(s) = \sum_{k \ge 0} s^k p(k)$ in
the interval $[0,1]$. Classical branching process theory (e.g. [GS01, vdH14])
tells that $\eta(p) \in [0,1]$ is well defined and equal to the extinction probability
of a Galton-Watson process with offspring distribution $p$. We denote the
corresponding survival probability by

\[ \zeta(p) = 1 - \eta(p). \tag{2.5} \]

As a consequence of [JL09, Thm 2.3], the limiting maximum component
size of a configuration model with limiting degree distribution $p$ corresponds
to the survival probability of a two-stage branching process where the root
node has offspring distribution $p$ and all other nodes have offspring
distribution $p^\circ$ defined by (2.2). Therefore, the branching functional
$p \mapsto \zeta_{CM}(p)$ appearing in (1.2) can be written as

\[ \zeta_{CM}(p) = 1 - G_p(\eta(p^\circ)). \tag{2.6} \]
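
Formula (2.6) is easy to evaluate numerically for a finitely supported degree
distribution. The sketch below is one possible way to do so, under the
assumption that the smallest fixed point is approximated by iterating
$s \leftarrow G_p(s)$ from $s = 0$ (the iterates increase to $\eta(p)$); the function
names are hypothetical, and convergence is slow near criticality.

```python
def pgf(p, s):
    """Generating function G_p(s) of a pmf given as a list indexed by k."""
    return sum(pk * s**k for k, pk in enumerate(p))

def downshift_size_bias(p):
    """Downshifted size biasing p° from (2.2); requires a nonzero mean."""
    mean = sum(k * pk for k, pk in enumerate(p))
    return [k * pk / mean for k, pk in enumerate(p)][1:]

def extinction_probability(p, tol=1e-12, max_iter=1_000_000):
    """Approximate eta(p) by iterating s <- G_p(s) starting from s = 0."""
    s = 0.0
    for _ in range(max_iter):
        s_next = pgf(p, s)
        if abs(s_next - s) < tol:
            return s_next
        s = s_next
    return s

def zeta_cm(p):
    """Branching functional (2.6): 1 - G_p(eta(p°))."""
    return 1.0 - pgf(p, extinction_probability(downshift_size_bias(p)))

# Dirac masses at 1 and at 3, written as pmfs on {0, 1, ...}.
print(zeta_cm([0.0, 1.0]), zeta_cm([0.0, 0.0, 0.0, 1.0]))
```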


The following two elementary examples will serve to illustrate the
nonconvexity of the branching functional (see Remark 2.3).

Example 2.1. For the Dirac measure $\delta_n$ at an integer $n \ge 1$ we have

\[ \zeta_{CM}(\delta_n) = \begin{cases} 0, & n = 1, \\ 1, & n \ge 2. \end{cases} \tag{2.7} \]

To verify this it suffices to note that $\zeta_{CM}(\delta_n) = 1 - G_{\delta_n}(\eta(\delta_n^\circ)) = 1 - \eta(\delta_n^\circ)^n$,
where $\eta(\delta_n^\circ) = \eta(\delta_{n-1})$ equals one for $n = 1$ and zero for $n \ge 2$.

Example 2.2. For the uniform distribution on $\{1, n\}$ we have

\[ \zeta_{CM}\left(\tfrac12 \delta_1 + \tfrac12 \delta_n\right) = \begin{cases} 0, & n = 1, 2, \\ 1 - o(1), & n > 2. \end{cases} \tag{2.8} \]

To verify this, an elementary computation shows that the downshifted size
biasing of $p_n = \tfrac12 \delta_1 + \tfrac12 \delta_n$ equals

\[ p_n^\circ = \frac{1}{n+1}\, \delta_0 + \frac{n}{n+1}\, \delta_{n-1}. \]

For $n = 1, 2$ the support of $p_n^\circ$ is contained in $\{0, 1\}$, which implies that
$\eta(p_n^\circ) = 1$. On the other hand, because $G_{p_n^\circ}(s) \to 0$ uniformly on $[0, s_0]$ for
all $s_0 < 1$, it follows that $\eta(p_n^\circ) = o(1)$ as $n \to \infty$. Hence the fact that
$G_{p_n}(s) \le s$ implies $G_{p_n}(\eta(p_n^\circ)) = o(1)$, and we may conclude (2.8).

Remark 2.3. The functional $p \mapsto \zeta_{CM}(p)$ is neither convex nor concave. To
see why, consider the probability distribution $\alpha p + (1-\alpha) q$ where $\alpha = 1/2$,
$p = \delta_1$, and $q = \delta_n$ for some integer $n > 1$. Then by formula (2.7),

\[ \alpha\, \zeta_{CM}(p) + (1-\alpha)\, \zeta_{CM}(q) = \tfrac12 \quad \text{for all } n \ge 2, \]

whereas by formula (2.8),

\[ \zeta_{CM}(\alpha p + (1-\alpha) q) = \begin{cases} 0, & n = 2, \\ 1 - o(1), & n > 2. \end{cases} \]
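
The values in Example 2.2 and Remark 2.3 can be checked numerically. The
following sketch is an illustration under stated assumptions rather than part
of the original analysis: it evaluates (2.6) for $\tfrac12 \delta_1 + \tfrac12 \delta_n$ with a
fixed-point iteration started at zero, and the helper names are hypothetical.

```python
def pgf(p, s):
    """Generating function of a pmf given as a list indexed by k."""
    return sum(pk * s**k for k, pk in enumerate(p))

def zeta_cm(p, tol=1e-12, max_iter=1_000_000):
    """Evaluate (2.6), approximating eta(p°) by iterating s <- G_{p°}(s) from 0."""
    mean = sum(k * pk for k, pk in enumerate(p))
    p_down = [k * pk / mean for k, pk in enumerate(p)][1:]
    s = 0.0
    for _ in range(max_iter):
        s_next = pgf(p_down, s)
        if abs(s_next - s) < tol:
            break
        s = s_next
    return 1.0 - pgf(p, s_next)

def two_point(n):
    """pmf of (1/2) delta_1 + (1/2) delta_n as a list indexed by degree."""
    p = [0.0] * (n + 1)
    p[1] += 0.5
    p[n] += 0.5
    return p

for n in (2, 3, 10, 50):
    # Compare with the average 1/2 of zeta_CM(delta_1) = 0 and zeta_CM(delta_n) = 1.
    print(n, zeta_cm(two_point(n)))
```

For $n = 2$ the mixture value lies below the average $1/2$ of the endpoint
values, and for $n \ge 3$ it lies above, which is the failure of both convexity
and concavity noted in Remark 2.3.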


2.3 Continuity properties 

We will next prove a continuity property of the branching functional which
is needed for proving Theorem 4.6, one of the main results of this paper.
For probability distributions $\mu, \mu_1, \mu_2, \dots$ on $\mathbb{R}_+$, we denote convergence in
distribution by $\mu_n \xrightarrow{d} \mu$. For probability distributions on $\mathbb{R}_+$ with finite first
moments, we denote $\mu_n \xrightarrow{w_1} \mu$ if $\mu_n$ converges to $\mu$ in distribution and $(\mu_n)$
is uniformly integrable. Recall [AGS08, Theorem 7.1.5] that $\mu_n \xrightarrow{w_1} \mu$ is
equivalent to convergence in the Wasserstein metric defined by

\[ d_W(\mu, \nu) = \inf_\gamma \int_{\mathbb{R}_+ \times \mathbb{R}_+} |x - y| \, \gamma(dx, dy), \]

where the infimum is computed over the set of all couplings $\gamma$ of $\mu$ and $\nu$. See
[LV13] for a probabilistic discussion about the Wasserstein metric.

Theorem 2.4. Let $p, p_n$ be probability measures on $\mathbb{Z}_+$ each with a finite
nonzero mean. If $p_n \xrightarrow{w_1} p$ and $p(1) > 0$, then $\zeta_{CM}(p_n) \to \zeta_{CM}(p)$.

The proof of the theorem is based on the following two lemmas. The
first states that convergence in the Wasserstein metric implies convergence
in distribution for associated size biasings.

Lemma 2.5. Let $\mu, \mu_1, \mu_2, \dots$ be probability measures on $\mathbb{R}_+$ each with a
finite nonzero mean. If $\mu_n \xrightarrow{w_1} \mu$, then $\mu_n^* \xrightarrow{d} \mu^*$ and $\mu_n^\circ \xrightarrow{d} \mu^\circ$.

Proof. Uniform integrability and $\mu_n \xrightarrow{d} \mu$ imply [Kal02, Lemma 4.11] that
$m_1(\mu_n) \to m_1(\mu)$. If $\phi : \mathbb{R}_+ \to \mathbb{R}$ is continuous and has compact support, then
$\psi(x) = x\phi(x)$ is continuous and bounded, and hence

\[ \mu_n^*(\phi) = \frac{\mu_n(\psi)}{m_1(\mu_n)} \to \frac{\mu(\psi)}{m_1(\mu)} = \mu^*(\phi). \]

We conclude that $\mu_n^* \to \mu^*$ vaguely. In addition, the uniform integrability
of $(\mu_n)$ implies tightness of $(\mu_n^*)$, and we may conclude [Kal02, Lemma 5.20]
that $\mu_n^* \xrightarrow{d} \mu^*$. The fact that $\mu_n^\circ \xrightarrow{d} \mu^\circ$ follows from the continuous mapping
theorem [Kal02, Theorem 4.27]. □

The following result on the continuity of extinction probabilities is
probably well known. Because we did not find it in the literature, the proof is
included in Appendix A for the reader's convenience.

Lemma 2.6. Let $p, p_n$ be probability measures on $\mathbb{Z}_+$. If $p_n \xrightarrow{d} p$ and $p(0) > 0$,
then $\eta(p_n) \to \eta(p)$.

Proof of Theorem 2.4. By Lemma 2.5, $p_n^\circ \xrightarrow{d} p^\circ$. Moreover, $p^\circ(0) = p(1)/m_1(p) > 0$.
Hence $\eta(p_n^\circ) \to \eta(p^\circ)$ by Lemma 2.6. The assumption that $p_n \xrightarrow{w_1} p$
also implies that $G_{p_n} \to G_p$ uniformly on $[0,1]$, as explained in the proof of
Lemma 2.6. Hence

\[ \zeta_{CM}(p_n) = 1 - G_{p_n}(\eta(p_n^\circ)) \to 1 - G_p(\eta(p^\circ)) = \zeta_{CM}(p). \]

□


2.4 Upper bounds 

A simple closed-form expression for $\zeta_{CM}(p)$ is not readily available due to the
implicit definition of $\eta(p^\circ)$. To get a qualitative insight into the behavior of
$\zeta_{CM}(p)$ as a functional of $p$, analytical bounds will be valuable. The following
result presents a fundamental upper bound which depends only on the mean of the
degree distribution. This result is implicitly contained in the proof of [BT12,
Theorem 2]. Here we provide a short and transparent proof.

Proposition 2.7. For any probability distribution $p$ with a finite nonzero
mean $\lambda$,

\[ \zeta_{CM}(p) \le \frac{\lambda}{2}. \tag{2.9} \]

Proof. Denote $z = \eta(p^\circ)$, so that by definition, $G_{p^\circ}(z) = z$. Moreover, the
convexity of $G_{p^\circ}$ implies that $G_{p^\circ}(s) \le s$ for all $s \in [z, 1]$. Next, by applying
(2.3) we can write

\[ G_p(s) = 1 - \lambda \int_s^1 G_{p^\circ}(t) \, dt. \]

Hence,

\[ \zeta_{CM}(p) = 1 - G_p(z) = \lambda \int_z^1 G_{p^\circ}(s) \, ds, \]

and we may conclude that

\[ \zeta_{CM}(p) \le \lambda \int_z^1 s \, ds \le \lambda \int_0^1 s \, ds = \frac{\lambda}{2}. \]

□


The upper bound in Proposition 2.7 tells nothing for graphs of mean
degree two or higher. The following result provides a crude upper bound
applicable also for $\lambda \ge 2$. Similar bounds for standard branching processes
have been derived in [SK14, VYZ14].

Proposition 2.8. For any probability distribution $p$ on $\mathbb{Z}_+$ with a finite
nonzero mean $\lambda$,

\[ \zeta_{CM}(p) \le 1 - p(0) - \frac{p(1)^2}{\lambda}. \tag{2.10} \]

Proof. Let $p^\circ$ be the downshifted size biasing of $p$ defined by (2.2). Because
a branching process with offspring distribution $p^\circ$ goes extinct at the first
step with probability $p^\circ(0)$, it follows that

\[ \eta(p^\circ) \ge p^\circ(0) = \frac{p(1)}{\lambda}. \]

Together with $G_p(s) \ge p(0) + p(1)s$, this shows that

\[ G_p(\eta(p^\circ)) \ge p(0) + \frac{p(1)^2}{\lambda}. \]

The above inequality substituted into (2.6) implies (2.10). □


The following result provides a more accurate upper bound of $\zeta_{CM}(p)$
based on $\lambda, p(0), p(1), p(2)$. Similar techniques may be applied to derive
more accurate upper bounds when a larger collection of low values of the
probability mass function of $p$ are known.

Proposition 2.9. For any probability distribution $p$ on $\mathbb{Z}_+$ with a finite
nonzero mean $\lambda$,

\[ \zeta_{CM}(p) \le 1 - p(0) - p(1)a - p(2)a^2, \quad \text{where } a = \frac{p(1)}{\lambda - 2p(2)}. \]

Proof. Observe that $G_{p^\circ}(s) \ge p^\circ(0) + p^\circ(1)s$ implies

\[ \eta(p^\circ) = G_{p^\circ}(\eta(p^\circ)) \ge p^\circ(0) + p^\circ(1)\,\eta(p^\circ), \]

so that

\[ \eta(p^\circ) \ge \frac{p^\circ(0)}{1 - p^\circ(1)} = \frac{\lambda^{-1} p(1)}{1 - 2\lambda^{-1} p(2)} = a. \]

Then by (2.6) and the monotonicity of $G_p$ we find that

\[ \zeta_{CM}(p) = 1 - G_p(\eta(p^\circ)) \le 1 - G_p(a). \tag{2.11} \]

Hence the claim follows by $G_p(a) \ge p(0) + p(1)a + p(2)a^2$.

□
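
To see how the three upper bounds compare on a concrete case, the sketch below
evaluates $\zeta_{CM}(p)$ and the bounds of Propositions 2.7, 2.8 and 2.9 for an
arbitrarily chosen distribution $p = (0.3, 0.3, 0.2, 0.2)$ on $\{0, 1, 2, 3\}$. The
example distribution and all function names are assumptions made for
illustration; the fixed point $\eta(p^\circ)$ is approximated by iteration from zero.

```python
def pgf(p, s):
    """Generating function of a pmf given as a list indexed by k."""
    return sum(pk * s**k for k, pk in enumerate(p))

def zeta_cm(p, tol=1e-12, max_iter=1_000_000):
    """Evaluate (2.6), approximating eta(p°) by iterating s <- G_{p°}(s) from 0."""
    mean = sum(k * pk for k, pk in enumerate(p))
    p_down = [k * pk / mean for k, pk in enumerate(p)][1:]
    s = 0.0
    for _ in range(max_iter):
        s_next = pgf(p_down, s)
        if abs(s_next - s) < tol:
            break
        s = s_next
    return 1.0 - pgf(p, s_next)

def upper_bounds(p):
    """Bounds of Propositions 2.7, 2.8, 2.9; p needs entries p[0], p[1], p[2]
    and a mean different from 2 p[2]."""
    lam = sum(k * pk for k, pk in enumerate(p))
    a = p[1] / (lam - 2 * p[2])
    return (lam / 2,
            1 - p[0] - p[1] ** 2 / lam,
            1 - p[0] - p[1] * a - p[2] * a ** 2)

p = [0.3, 0.3, 0.2, 0.2]
print(zeta_cm(p), upper_bounds(p))
```

In this example $\eta(p^\circ) = 1/2$ can be found by hand from the quadratic
$G_{p^\circ}(s) = s$, so that $\zeta_{CM}(p) = 0.475$, and each bound is tighter than the
previous one.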
2.5 Thinning

The study of percolation and epidemics on random graphs requires the
analysis of thinned degree distributions (see Section 4.3). The $r$-thinning of a
probability distribution $p$ on $\mathbb{Z}_+$ with $r \in [0,1]$ is the probability distribution
$T_r p$ on $\mathbb{Z}_+$ with probability mass function

\[ T_r p(k) = \sum_{n \ge k} \binom{n}{k} r^k (1-r)^{n-k} p(n) \]

and generating function

\[ G_{T_r p}(s) = G_p \circ G_{\mathrm{Ber}(r)}(s) = G_p(1 - r + rs). \]

The $r$-thinning of $p$ can be recognized as a mixed binomial distribution of a
random integer $X_r$ corresponding to a random vector $(X_r, X)$ where $X$ is
$p$-distributed and the conditional distribution of $X_r$ given $X = n$ is $\mathrm{Bin}(n, r)$.
Alternatively, if $X, \theta_1, \theta_2, \dots$ are mutually independent and such that $X$ is
$p$-distributed and $\theta_i$ is $\mathrm{Ber}(r)$-distributed, then $X_r = \sum_{i=1}^X \theta_i$ is distributed
according to the $r$-thinning of $p$.

Example 2.10. The $r$-thinnings of Dirac, binomial, and Poisson
distributions are given by $T_r(\delta_n) = \mathrm{Bin}(n, r)$, $T_r(\mathrm{Bin}(n, a)) = \mathrm{Bin}(n, ar)$, and
$T_r(\mathrm{Poi}(\lambda)) = \mathrm{Poi}(\lambda r)$.

Lemma 2.11. The downshifted size biasing and $r$-thinning operations
commute according to $(T_r p)^\circ = T_r(p^\circ)$.

Proof. Because $G_{\mathrm{Ber}(r)}(s) = 1 - r + rs$, the chain rule gives

\[ G_{T_r p}'(s) = G_p'(G_{\mathrm{Ber}(r)}(s))\, r. \]

Using this formula together with (2.3) and $m_1(T_r p) = r\, m_1(p)$, we see that

\[ G_{(T_r p)^\circ}(s) = \frac{G_{T_r p}'(s)}{m_1(T_r p)} = \frac{G_p'(G_{\mathrm{Ber}(r)}(s))\, r}{r\, m_1(p)} = \frac{G_p'(G_{\mathrm{Ber}(r)}(s))}{m_1(p)}, \]

and from this we may conclude that

\[ G_{(T_r p)^\circ}(s) = G_{p^\circ}(G_{\mathrm{Ber}(r)}(s)) = G_{T_r(p^\circ)}(s). \]

□
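
The commutation property of Lemma 2.11 can be verified numerically on a small
example. The following Python sketch is illustrative only: it implements the
$r$-thinning as a binomial mixture and compares $(T_r p)^\circ$ with $T_r(p^\circ)$ entry
by entry; the function names and the test distribution are assumptions.

```python
from math import comb

def thin(p, r):
    """r-thinning T_r p: mix Bin(n, r) over n distributed according to p."""
    out = [0.0] * len(p)
    for n, pn in enumerate(p):
        for k in range(n + 1):
            out[k] += comb(n, k) * r**k * (1 - r) ** (n - k) * pn
    return out

def downshift_size_bias(p):
    """Downshifted size biasing p° from (2.2); requires a nonzero mean."""
    mean = sum(k * pk for k, pk in enumerate(p))
    return [k * pk / mean for k, pk in enumerate(p)][1:]

p = [0.1, 0.2, 0.3, 0.4]
r = 0.3
lhs = downshift_size_bias(thin(p, r))   # (T_r p)°
rhs = thin(downshift_size_bias(p), r)   # T_r (p°)
print(max(abs(x - y) for x, y in zip(lhs, rhs)))
```

The printed maximal difference should be at the level of floating-point
rounding error. The case $r = 0$ is excluded here because $T_0 p = \delta_0$ has zero
mean, so its size biasing is undefined.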


3 Stochastic ordering of branching processes 

3.1 Strong and convex stochastic orders 

The upper bound of $\zeta_{CM}(p)$ obtained in Proposition 2.8 is rough as it
disregards information about the tail characteristics of $p$. To obtain better
estimates, we will develop in this section techniques based on the theory of
stochastic orders (see [MS02] or [SS07] for comprehensive surveys).

Integral stochastic orderings between probability distributions on $\mathbb{R}$ (or
a subset) are defined by requiring

\[ \int \phi(x) \, \mu(dx) \le \int \phi(x) \, \nu(dx) \tag{3.1} \]

to hold for all functions $\phi : \mathbb{R} \to \mathbb{R}$ in a certain class of functions such that
both integrals above exist. The strong stochastic order is defined by denoting
$\mu \le_{st} \nu$ if (3.1) holds for all increasing functions $\phi$. The convex (resp.
concave, increasing convex, increasing concave) stochastic order is defined
by denoting $\mu \le_{cx} \nu$ (resp. $\mu \le_{cv} \nu$, $\mu \le_{icx} \nu$, $\mu \le_{icv} \nu$) if (3.1) holds for all
convex (resp. concave, increasing convex, increasing concave) functions $\phi$.
For random numbers $X$ and $Y$ distributed according to $\mu$ and $\nu$, we denote
$X \le_{st} Y$ if $\mu \le_{st} \nu$, and similarly for other integral stochastic orders.

When $X \le_{st} Y$ we say that $X$ is smaller than $Y$ in the strong order
because then $P(X > t) \le P(Y > t)$ for all $t$. When $X \le_{cx} Y$ we say that
$X$ is less variable than $Y$ in the convex order, because then $\mathbb{E} X = \mathbb{E} Y$ and
$\mathrm{Var}(X) \le \mathrm{Var}(Y)$ whenever the second moments exist. Note that $X \le_{cv} Y$
if and only if $X \ge_{cx} Y$, that is, $X$ is less concentrated than $Y$. The
order $X \le_{icv} Y$ can be interpreted by saying that $X$ is smaller and less
concentrated than $Y$.

3.2 Stochastic ordering and branching processes 

To obtain sharp results for branching processes, it is useful to introduce one
more integral stochastic order. For probability distributions $\mu$ and $\nu$ on $\mathbb{R}_+$
(or a subset thereof), the Laplace transform order is defined by denoting
$\mu \le_{Lt} \nu$ if (3.1) holds for all functions $\phi$ of the form $\phi(x) = -e^{-tx}$ with
$t \ge 0$. Observe that $\mu \le_{Lt} \nu$ is equivalent to requiring $L_\mu(t) \ge L_\nu(t)$ for all
$t \ge 0$, where we denote the Laplace transform of $\mu$ by $L_\mu(t) = \int e^{-tx} \mu(dx)$.
For probability distributions $p$ and $q$ on $\mathbb{Z}_+$, observe that $p \le_{Lt} q$ if and only
if their generating functions are ordered by $G_p(s) \ge G_q(s)$ for all $s \in [0,1]$.
Because for any $t \ge 0$, the function $x \mapsto -e^{-tx}$ is increasing and concave, it
follows that

\[ \mu \le_{st} \nu \implies \mu \le_{icv} \nu \implies \mu \le_{Lt} \nu. \]

Due to the above implications we may interpret $X \le_{Lt} Y$ as $X$ being smaller
and less concentrated than $Y$ (in a weaker sense than $X \le_{icv} Y$).

The following elementary result confirms an intuitive fact that a branching
population with a smaller and more variable offspring distribution is less
likely to survive in the long run. The proof can be obtained as a special case
of a slightly more general result below (Lemma 4.5).
                                  $p$      $q$      $p^\circ$   $q^\circ$
--------------------------------  -------  -------  ----------  ----------
mean                              2.000    2.000    1.125       1.375
variance                          0.250    0.750    0.234       0.609
extinction probability $\eta$     0.000    0.076    0.333       0.186

Table 2: Statistical indices associated to $p$ and $q$ and their downshifted size
biasings.

Proposition 3.1. When $p \le_{Lt} q$, the survival probabilities defined by (2.5)
are ordered according to $\zeta(p) \le \zeta(q)$. Especially,

\[ p \le_{st} q \ \text{or} \ p \le_{cv} q \implies p \le_{icv} q \implies p \le_{Lt} q \implies \zeta(p) \le \zeta(q). \]
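
Proposition 3.1 can be illustrated with a small numerical experiment. In the
sketch below, the Laplace transform order between two finitely supported
distributions is checked on a grid through the generating function
characterization $G_p(s) \ge G_q(s)$, which is a numerical sanity check and not a
proof. The two example distributions and the function names are assumptions
chosen for illustration.

```python
def pgf(p, s):
    """Generating function of a pmf given as a list indexed by k."""
    return sum(pk * s**k for k, pk in enumerate(p))

def survival_probability(p, tol=1e-12, max_iter=1_000_000):
    """zeta(p) = 1 - eta(p), with eta(p) found by iterating s <- G_p(s) from 0."""
    s = 0.0
    for _ in range(max_iter):
        s_next = pgf(p, s)
        if abs(s_next - s) < tol:
            break
        s = s_next
    return 1.0 - s_next

def laplace_leq_on_grid(p, q, points=1001):
    """Check G_p(s) >= G_q(s) on an evenly spaced grid of [0, 1]."""
    grid = (i / (points - 1) for i in range(points))
    return all(pgf(p, s) >= pgf(q, s) - 1e-12 for s in grid)

p = [0.3, 0.3, 0.4]   # stochastically smaller offspring distribution
q = [0.2, 0.3, 0.5]
print(laplace_leq_on_grid(p, q), survival_probability(p), survival_probability(q))
```

Here $G_p(s) - G_q(s) = 0.1(1 - s^2) \ge 0$, and solving the two quadratics by
hand gives $\eta(p) = 0.75$ and $\eta(q) = 0.4$, so the survival probabilities are
ordered as the proposition predicts.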


4 Stochastic ordering of the configuration model 

Basic intuition about standard branching processes, as confirmed by
Proposition 3.1, suggests that a large configuration model with a smaller and more
variable degree distribution should have a smaller giant component. The
next subsection displays a counterexample where this intuitive reasoning
fails.


4.1 A counterexample 

Consider degree distributions $p$ and $q$ defined by

\[ p = \tfrac{1}{8} \delta_1 + \tfrac{6}{8} \delta_2 + \tfrac{1}{8} \delta_3, \]
\[ q = \tfrac{1}{16} \delta_0 + \tfrac{1}{8} \delta_1 + \tfrac{5}{8} \delta_2 + \tfrac{1}{8} \delta_3 + \tfrac{1}{16} \delta_4, \]

where $\delta_k$ represents the Dirac point mass at point $k$. Their downshifted size
biasings, computed using (2.2), are given by

\[ p^\circ = \tfrac{1}{16} \delta_0 + \tfrac{12}{16} \delta_1 + \tfrac{3}{16} \delta_2, \]
\[ q^\circ = \tfrac{1}{16} \delta_0 + \tfrac{10}{16} \delta_1 + \tfrac{3}{16} \delta_2 + \tfrac{2}{16} \delta_3. \]

By comparing integrals of cumulative distribution functions [SS07, Thm
3.A.1] or by constructing a martingale coupling [LV14], it is not hard to verify
that in this case $p \le_{cx} q$. Numerically computed values for the associated
means, variances, and extinction probabilities are listed in Table 2. By
evaluating the associated generating functions at $\eta(p^\circ) = 0.333$ and
$\eta(q^\circ) = 0.186$, we find using (2.6) that $\zeta_{CM}(p) = 0.870$ and $\zeta_{CM}(q) = 0.892$.
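
The numbers quoted for this counterexample can be recomputed with a short
script. The sketch below is a hedged illustration, not the authors' code: it
assumes that the extinction probabilities are obtained by iterating the
generating function from zero, and the helper names are hypothetical.

```python
def pgf(p, s):
    """Generating function of a pmf given as a list indexed by k."""
    return sum(pk * s**k for k, pk in enumerate(p))

def mean(p):
    return sum(k * pk for k, pk in enumerate(p))

def variance(p):
    return sum(k * k * pk for k, pk in enumerate(p)) - mean(p) ** 2

def downshift_size_bias(p):
    """Downshifted size biasing p° from (2.2)."""
    m = mean(p)
    return [k * pk / m for k, pk in enumerate(p)][1:]

def extinction_probability(p, tol=1e-12, max_iter=1_000_000):
    """Smallest fixed point of G_p in [0, 1], by iterating s <- G_p(s) from 0."""
    s = 0.0
    for _ in range(max_iter):
        s_next = pgf(p, s)
        if abs(s_next - s) < tol:
            break
        s = s_next
    return s_next

def zeta_cm(p):
    """Branching functional (2.6)."""
    return 1.0 - pgf(p, extinction_probability(downshift_size_bias(p)))

p = [0.0, 1 / 8, 6 / 8, 1 / 8]
q = [1 / 16, 1 / 8, 5 / 8, 1 / 8, 1 / 16]
for name, dist in (("p", p), ("q", q),
                   ("p°", downshift_size_bias(p)), ("q°", downshift_size_bias(q))):
    print(name, mean(dist), variance(dist), extinction_probability(dist))
print("zeta_CM:", zeta_cm(p), zeta_cm(q))
```

For $p^\circ$ the fixed-point equation reduces to $3s^2 - 4s + 1 = 0$, so
$\eta(p^\circ) = 1/3$ exactly and $\zeta_{CM}(p) = 1 - G_p(1/3) = 188/216$.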

This example shows that a standard branching process with a less variable
offspring distribution ($p \le_{cx} q$) is less likely to go extinct ($\eta(p) < \eta(q)$),
but the same is not true for the downshifted size-biased offspring
distributions ($\eta(p^\circ) > \eta(q^\circ)$). As a consequence, the giant component of a large
random graph corresponding to a configuration model with limiting degree
distribution $p$ is with high probability smaller than the giant component in a
similar model with limiting degree distribution $q$, even though $p$ is less
variable than $q$. The reason for this is that, even though higher variability has
an unfavorable effect on standard branching (the immediate neighborhood
of the root node), higher variability also causes the neighbors of a neighbor
to have bigger degrees on average.

4.2 A monotonicity result when one extinction probability 
is small 

The following result shows that increasing the variability of a degree
distribution $p$ does decrease the limiting relative size of a giant component, under
the extra conditions that $p(0) = q(0)$ and that the extinction probability
related to $q^\circ$ is less than $e^{-2} \approx 0.135$. Note that in the analysis of
configuration models it is often natural to assume that $p(0) = q(0)$ because nodes
without any half-links have no effect on large components.

Theorem 4.1. Assume that $p \le_{icv} q$, $p(0) = q(0)$, and $\eta(q^\circ) \le e^{-2}$. Then
$\zeta_{CM}(p) \le \zeta_{CM}(q)$.

Remark 4.2. Assume that $q(1) > 0$ and that $\zeta_{CM}(q) \ge 1 - q(0) - q(1)e^{-2}$.
If this holds, then the inequality $G_q(s) \ge q(0) + q(1)s$ applied to $s = \eta(q^\circ)$
implies that

\[ q(0) + q(1)e^{-2} \ge 1 - \zeta_{CM}(q) = G_q(\eta(q^\circ)) \ge q(0) + q(1)\,\eta(q^\circ), \]

so that $\eta(q^\circ) \le e^{-2}$ as required in Theorem 4.1.

Theorem 4.1 is a direct consequence (choose $\ell = 0$ below) of the following
slightly more general result.

Theorem 4.3. Assume that $p \le_{icv} q$, $p(i) = q(i)$ for all $i \in \{0, 1, 2, \dots, \ell\}$
and $\eta(q^\circ) \le e^{-2/(\ell+1)}$ for some integer $\ell \ge 0$. Then $\zeta_{CM}(p) \le \zeta_{CM}(q)$.

The proof of Theorem 4.3 is based on the following two lemmas. 

Lemma 4.4. If $p \le_{icv} q$ and $p(i) = q(i)$ for all $i \in \{0, 1, 2, \dots, \ell\}$, then the
generating functions of the downshifted size biasings of $p$ and $q$ are ordered
by

\[ G_{p^\circ}(s) \ge G_{q^\circ}(s) \quad \text{for all } s \in [0, e^{-2/(\ell+1)}]. \]

Proof. Fix $s \in (0, e^{-2/(\ell+1)}]$. Define a function $\phi_s : \mathbb{R}_+ \to \mathbb{R}_+$ by

\[ \phi_s(x) = x s^x. \]

Denote $t = -\log s$, so that $t \in [2/(\ell+1), \infty)$. Because $\phi_s'(x) = (1 - tx)e^{-tx}$
and $\phi_s''(x) = t(tx - 2)e^{-tx}$, we find that $\phi_s$ is decreasing on $[\frac{1}{-\log s}, \infty)$ and
convex on $[\frac{2}{-\log s}, \infty)$. Because $-\log s \ge \frac{2}{\ell+1}$, it follows that $\phi_s$ is decreasing
and convex on $[\ell+1, \infty)$. Define a decreasing convex function $\psi_s$ by

\[ \psi_s(x) = f_s(x)\, 1_{[0, x_0)}(x) + \phi_s(x)\, 1_{[x_0, \infty)}(x) \tag{4.1} \]

where $f_s(x) = \phi_s(x_0) + \phi_s'(x_0)(x - x_0)$ and $x_0 = \ell + 1$ (see Figure 1).
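
The construction in (4.1) replaces $\phi_s$ by its tangent line to the left of
$x_0 = \ell + 1$. As a quick sanity check, the following sketch evaluates $\psi_s$ on a
grid and tests monotonicity and convexity through first and second differences.
The parameter values $t = 3$ and $\ell = 0$, the grid, and the function names are
assumptions made for illustration.

```python
from math import exp, log

def phi(x, s):
    """phi_s(x) = x * s**x."""
    return x * s**x

def psi(x, s, ell):
    """Modification (4.1): tangent of phi_s at x0 = ell + 1 on [0, x0), phi_s beyond."""
    t = -log(s)
    x0 = ell + 1
    if x >= x0:
        return phi(x, s)
    slope = (1.0 - t * x0) * exp(-t * x0)   # phi_s'(x0)
    return phi(x0, s) + slope * (x - x0)

s, ell = exp(-3.0), 0            # t = 3 satisfies t >= 2 / (ell + 1)
xs = [0.01 * i for i in range(501)]
values = [psi(x, s, ell) for x in xs]
first = [b - a for a, b in zip(values, values[1:])]
second = [c - 2 * b + a for a, b, c in zip(values, values[1:], values[2:])]
print(max(first) <= 1e-12, min(second) >= -1e-12)
```

Both printed flags should be True: the tangent line has negative slope
$\phi_s'(x_0) = (1 - t x_0) e^{-t x_0}$, and the two pieces join with matching value
and derivative at $x_0$.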

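Let $X$ and $Y$ be random integers distributed according to $p$ and $q$. Because
$-\psi_s$ is increasing and concave, the assumption $p \le_{icv} q$ implies that

\[ \mathbb{E}(\psi_s(X)) \ge \mathbb{E}(\psi_s(Y)). \]

By the second assumption $p(i) = q(i)$ for all $i \in \{0, 1, 2, \dots, \ell\}$, the above
inequality can also be written as

\[ \sum_{i=\ell+1}^\infty \psi_s(i)\, p(i) \ge \sum_{i=\ell+1}^\infty \psi_s(i)\, q(i), \]

and hence by (4.1) we find that

\[ \sum_{i=\ell+1}^\infty \phi_s(i)\, p(i) \ge \sum_{i=\ell+1}^\infty \phi_s(i)\, q(i). \]

By applying again the second assumption we obtain

\[ \mathbb{E}(\phi_s(X)) \ge \mathbb{E}(\phi_s(Y)), \]

which implies $G_p'(s) \ge G_q'(s)$. Because $p \le_{icv} q$ also implies $\mathbb{E}(X) \le \mathbb{E}(Y)$,
we see by (2.3) that

\[ G_{p^\circ}(s) = \frac{G_p'(s)}{\mathbb{E}(X)} \ge \frac{G_q'(s)}{\mathbb{E}(Y)} = G_{q^\circ}(s). \tag{4.2} \]

Hence the claim is true for all $s \in (0, e^{-2/(\ell+1)}]$. By the continuity of $G_{p^\circ}$
and $G_{q^\circ}$, the claim is also true for $s = 0$.

□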


Lemma 4.5. If $G_p(s) \ge G_q(s)$ for all $s \in [0, \eta(q)]$, then $\eta(p) \ge \eta(q)$.

Proof. The claim is trivial for $\eta(q) = 0$, so let us assume that $\eta(q) > 0$. Then
$G_q(0) > 0$, and the continuity of $s \mapsto G_q(s) - s$ implies that $G_q(s) > s$ for
all $s \in [0, \eta(q))$. Hence also

\[ G_p(s) \ge G_q(s) > s \]

for all $s \in [0, \eta(q))$. This shows that $G_p$ has no fixed points in $[0, \eta(q))$ and
therefore $\eta(p)$, the smallest fixed point of $G_p$ in $[0,1]$, must be greater than
or equal to $\eta(q)$. □
Figure 1: Function $\phi_s$ (blue) and its convex modification $\psi_s$ (red) for $t = 3$.

Proof of Theorem 4.3. By applying Lemma 4.4 we see that

\[ G_{p^\circ}(s) \ge G_{q^\circ}(s) \tag{4.3} \]

for all $s \in [0, e^{-2/(\ell+1)}]$. The assumption $\eta(q^\circ) \le e^{-2/(\ell+1)}$ further guarantees
that (4.3) is true for all $s \in [0, \eta(q^\circ)]$. Lemma 4.5 then shows that
$\eta(p^\circ) \ge \eta(q^\circ)$. Finally, $p \le_{icv} q$ implies $p \le_{Lt} q$, so that $G_p(s) \ge G_q(s)$ for all
$s \in [0,1]$. Therefore, the monotonicity of $G_p$ implies that

\[ G_p(\eta(p^\circ)) \ge G_p(\eta(q^\circ)) \ge G_q(\eta(q^\circ)). \]

By substituting the above inequality into (2.6), we obtain Theorem 4.3. □

4.3 Application: Social network modeling 

Consider a large online social network of mean degree $\lambda_0$ where users forward
copies of messages to their neighbors independently of each other with
probability $r_0$. Without any a priori information about the higher order statistics
of the degree distribution, one might choose to model the network using a
configuration model with some degree distribution which is similar to one
observed in some known social network. Because several well-studied social
network data sets exhibit a power-law tail in their degree data, a natural first
choice is to model the unknown network using a configuration model with a
Pareto-mixed Poisson limiting degree distribution (recall (2.4) and Table 1)

\[ p_0 = \mathrm{MPoi}(\mathrm{Par}(\alpha, c_0)) \tag{4.4} \]

with shape $\alpha > 1$, scale $c_0 = \lambda_0(1 - 1/\alpha)$, and mean $\lambda_0$.

Because the above choice of degree distribution was made without
regard to network data, it is important to analyze how big an impact a
wrong choice can have on key network characteristics. When interested in global
effects on information spreading, it is natural to consider the epidemic
generated graph obtained by deleting stubs of the initial configuration model
independently with probability $1 - r_0$. The outcome corresponds to another
configuration model where the limiting degree distribution $p = T_{r_0} p_0$ is the
$r_0$-thinning of $p_0$ defined in Section 2.5. Using generating functions one may
verify that the $r$-thinning of a $\mu$-mixed Poisson distribution $\mathrm{MPoi}(\mu)$ equals
$\mathrm{MPoi}(r\mu)$, where $r\mu$ denotes the distribution of a $\mu$-distributed random number
multiplied by $r \in [0,1]$. Because $r\,\mathrm{Par}(\alpha, c) = \mathrm{Par}(\alpha, rc)$, it follows that the
Pareto-mixed Poisson distribution is scale-free in the sense that

\[ T_r\, \mathrm{MPoi}(\mathrm{Par}(\alpha, c)) = \mathrm{MPoi}(\mathrm{Par}(\alpha, rc)). \]

See [ALW14] for an insightful discussion on scale-free properties of discrete
probability distributions. As a consequence, the $r_0$-thinning of $p_0$ in (4.4)
equals

\[ p = \mathrm{MPoi}(\mathrm{Par}(\alpha, \lambda(1 - 1/\alpha))) \tag{4.5} \]

with $\lambda = \lambda_0 r_0$.

Now the key quantity describing the information spreading dynamics of
the social network model is given by $\zeta_{CM}(p)$ defined in (2.6). To study how
sensitive this functional is to the variability of $p$, we have numerically
evaluated $\zeta_{CM}(p)$ for different values of $\alpha$ and $\lambda$, see Fig. 2. An extreme case is
obtained by letting $\alpha \to \infty$, which leads to the standard Poisson distribution
with mean $\lambda$. Again, perhaps a bit surprisingly, we see that for small values
of $\lambda$, a Pareto-mixed Poisson as a limiting degree distribution may produce
an asymptotically larger maximally connected component in a configuration
model than one with a less variable unmixed Poisson distribution with
the same mean. This phenomenon is most clearly visible when $\lambda = 0.9$, in
which case $\zeta_{CM}(\mathrm{Poi}(\lambda)) = \zeta(\mathrm{Poi}(\lambda)) = 0$ but a Pareto-mixed Poisson
degree distribution with a heavy enough tail yields nonzero values of $\zeta_{CM}$, as
shown by the magenta curve in Fig. 2. For sufficiently large values of $\lambda$, this
phenomenon appears not to take place.
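
A numerical evaluation of this kind can be sketched as follows. The code below
is not the procedure used to produce Fig. 2, which the text does not describe;
it is one possible approach under stated assumptions. It uses the identities
$G_{\mathrm{MPoi}(\mu)}(s) = L_\mu(1 - s)$ and (2.4), approximates the Laplace transform of a
Pareto distribution by a midpoint rule applied to the quantile representation
$X = c V^{-1/\alpha}$ with $V$ uniform on $(0,1)$, and finds $\eta(p^\circ)$ by fixed-point
iteration from zero. All function names and the grid size are hypothetical.

```python
from math import exp

def pareto_laplace(t, shape, scale, n=2000):
    """Midpoint-rule value of E exp(-t X) for X = scale * V**(-1/shape), V ~ U(0, 1)."""
    h = 1.0 / n
    return h * sum(exp(-t * scale * ((i + 0.5) * h) ** (-1.0 / shape))
                   for i in range(n))

def zeta_cm_pareto_poisson(lam, alpha, tol=1e-10, max_iter=10_000):
    """zeta_CM of MPoi(Par(alpha, c)) with c = lam * (1 - 1/alpha), for alpha > 1."""
    c = lam * (1.0 - 1.0 / alpha)
    s = 0.0
    for _ in range(max_iter):
        # G_{p°}(s) = L_{mu*}(1 - s) with mu* = Par(alpha - 1, c), see (2.4).
        s_next = pareto_laplace(1.0 - s, alpha - 1.0, c)
        if abs(s_next - s) < tol:
            break
        s = s_next
    # G_p(s) = L_mu(1 - s) with mu = Par(alpha, c).
    return 1.0 - pareto_laplace(1.0 - s_next, alpha, c)

for alpha in (1.5, 3.0, 10.0):
    print(alpha, zeta_cm_pareto_poisson(0.9, alpha))
```

For $\lambda = 0.9$ and $\alpha = 1.5$ the mixing distribution $\mathrm{Par}(\alpha - 1, c)$ has an infinite
mean, so the branching process with offspring distribution $p^\circ$ is
supercritical and the computed value of $\zeta_{CM}$ is strictly positive, in contrast
to the Poisson case with the same mean.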

Proving the monotonicity of $\zeta_{CM}(p)$ for Pareto-mixed Poisson
distributions of the form (4.5) is not directly possible using Theorem 4.1 because
$p(0)$ is not constant with respect to the shape parameter $\alpha$. However, the
following result can be applied here. Let us define a constant

\[ \lambda_{cr} = \inf\{ \lambda \ge 0 : \lambda\, \zeta(\mathrm{Poi}(\lambda)) = 2 \}. \]

Because $\lambda \mapsto \lambda\, \zeta(\mathrm{Poi}(\lambda))$ is increasing (Proposition 3.1) and continuous
(Lemma 2.6) and grows from zero to infinity as $\lambda$ ranges from zero to infinity,
it follows that $\lambda_{cr} \in (2, \infty)$ is well defined. Numerical computations indicate
that $\lambda_{cr} \approx 2.3$. The following result establishes a monotonicity result for
the configuration model with a Pareto-mixed Poisson limiting distribution
$p_\alpha = \mathrm{MPoi}(\mu_\alpha)$ with $\mu_\alpha = \mathrm{Par}(\alpha, c_\alpha)$, where the scale $c_\alpha = \lambda(1 - 1/\alpha)$ is
chosen so that the mean of $\mu_\alpha$ equals $\lambda$ for all $\alpha > 1$ (recall Table 1).
Figure 2: Configuration model branching functional $\zeta_{CM}(p)$ for a collection
of Pareto-mixed Poisson degree distributions with mean $\lambda$, plotted as a
function of the shape parameter $\alpha > 1$.

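Theorem 4.6. For any $\lambda > \lambda_{cr}$ there exists a constant $\alpha_{cr} > 1$ such that

\[ \zeta_{CM}(p_\alpha) \le \zeta_{CM}(p_\beta) \le \zeta_{CM}(\mathrm{Poi}(\lambda)) \]

for all $\alpha_{cr} \le \alpha \le \beta$.

Remark 4.7. Note that $\zeta_{CM}(\mathrm{Poi}(\lambda)) = \zeta(\mathrm{Poi}(\lambda))$ due to the fact that the
Poisson distribution is invariant to downshifted size biasing (cf. Table 1).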
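
The critical value $\lambda_{cr}$ appearing in Theorem 4.6 is easy to approximate
numerically. The sketch below is an illustration with hypothetical function
names: it computes $\zeta(\mathrm{Poi}(\lambda))$ from the fixed-point equation
$\eta = e^{-\lambda(1-\eta)}$ and locates the root of $\lambda\,\zeta(\mathrm{Poi}(\lambda)) = 2$ by bisection
on the bracket $[2, 3]$, which is an assumption justified by the monotonicity
noted in the text.

```python
from math import exp

def poisson_survival(lam, tol=1e-14, max_iter=1_000_000):
    """zeta(Poi(lam)) = 1 - eta, eta being the smallest root of s = exp(-lam (1 - s))."""
    s = 0.0
    for _ in range(max_iter):
        s_next = exp(-lam * (1.0 - s))
        if abs(s_next - s) < tol:
            break
        s = s_next
    return 1.0 - s_next

def critical_lambda(lo=2.0, hi=3.0, iterations=60):
    """Bisection for the root of lam * zeta(Poi(lam)) = 2 on [lo, hi]."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if mid * poisson_survival(mid) < 2.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

print(critical_lambda(), 2.0 / (1.0 - exp(-2.0)))
```

The second printed number is a closed form for comparison: at the critical
value, $\lambda(1 - \eta) = 2$ forces $\eta = e^{-2}$, hence $\lambda_{cr} = 2/(1 - e^{-2}) \approx 2.313$,
consistent with the approximate value 2.3 quoted in the text.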

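Proof. Fix $\lambda > \lambda_{\mathrm{cr}}$ and denote $\eta_\infty = \eta(\mathrm{Poi}(\lambda))$. Because
$\lambda > \lambda_{\mathrm{cr}}$, it follows that $\lambda(1 - \eta_\infty) > 2$, and therefore

\[ \frac{2}{1 - 1/\alpha_0} \le \lambda(1 - \eta_\infty) - \lambda\epsilon \tag{4.6} \]

for some large enough $\alpha_0 > 1$ and small enough $\epsilon > 0$. Next, Lemma 4.9
below shows that $\mu_\alpha^* = \mathrm{Par}(\alpha - 1, c_\alpha) \to \delta_\lambda$ and hence also
$p_\alpha^\circ = \mathrm{MPoi}(\mu_\alpha^*) \to \mathrm{Poi}(\lambda)$ in distribution as $\alpha \to \infty$.
The continuity of the standard branching functional (Lemma 2.6) implies that
$\eta(p_\alpha^\circ) \to \eta_\infty$, and we may choose a constant $\alpha_{\mathrm{cr}} \ge \alpha_0$
such that $\eta(p_\alpha^\circ) \le \eta_\infty + \epsilon$ for all $\alpha \ge \alpha_{\mathrm{cr}}$.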

Assume now that $\alpha_{\mathrm{cr}} \le \alpha \le \beta$. Then by Theorem B.1 we know that

\[ \mu_\alpha \le_{\mathrm{icv}} \mu_\beta \le_{\mathrm{icv}} \delta_\lambda. \tag{4.7} \]
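Furthermore, $c_{\alpha_0} \le c_\alpha \le c_\beta$ implies that the supports of $\mu_\alpha$, $\mu_\beta$,
and $\delta_\lambda$ are contained in $[c_{\alpha_0}, \infty)$. Lemma 4.8 below implies that
$G_{p_\alpha^\circ}(s) \ge G_{p_\beta^\circ}(s) \ge G_{\mathrm{Poi}(\lambda)}(s)$ for all $s \in [0, s_0]$ where
$s_0 = 1 - 2/c_{\alpha_0}$. Now (4.6) shows that

\[ s_0 = 1 - \lambda^{-1} \frac{2}{1 - 1/\alpha_0} \ge 1 - \lambda^{-1}\bigl(\lambda(1 - \eta_\infty) - \lambda\epsilon\bigr) = \eta_\infty + \epsilon, \]

and hence the interval $[0, s_0]$ contains both $[0, \eta_\infty]$ and $[0, \eta(p_\beta^\circ)]$. By applying
Lemma 4.5 twice, it follows that $\eta(p_\alpha^\circ) \ge \eta(p_\beta^\circ) \ge \eta(\mathrm{Poi}(\lambda)) = \eta_\infty$.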

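On the other hand, inequality (4.7) together with [SS07, Thm 8.A.14] implies
that $\mathrm{MPoi}(\mu_\alpha) \le_{\mathrm{icv}} \mathrm{MPoi}(\mu_\beta) \le_{\mathrm{icv}} \mathrm{Poi}(\lambda)$. In particular,
$p_\alpha \le_{\mathrm{Lt}} p_\beta \le_{\mathrm{Lt}} \mathrm{Poi}(\lambda)$, so that
$G_{p_\alpha} \ge G_{p_\beta} \ge G_{\mathrm{Poi}(\lambda)}$ pointwise on $[0,1]$. This together with
the monotonicity of the generating functions shows that

\[ G_{p_\alpha}(\eta(p_\alpha^\circ)) \ge G_{p_\beta}(\eta(p_\beta^\circ)) \ge G_{\mathrm{Poi}(\lambda)}(\eta(\mathrm{Poi}(\lambda))), \]

and the claim follows by substituting the above inequalities into (2.6). □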
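
The ordering in Theorem 4.6 can be probed numerically for one parameter choice. The following sketch assumes that (2.6) reads $\zeta_{\mathrm{CM}}(p) = 1 - G_p(\eta(p^\circ))$, which is consistent with the last step of the proof and with Remark 4.7, and it uses the identities $G_{\mathrm{MPoi}(\mu)}(s) = L_\mu(1-s)$ and $\mu_\alpha^* = \mathrm{Par}(\alpha - 1, c_\alpha)$ from the text. The Laplace transforms are approximated by a midpoint rule over quantiles; all function names and the node count are illustrative assumptions.

```python
import math

def pareto_nodes(alpha, c, n=20_000):
    """Quantile nodes c * u**(-1/alpha) at midpoints u = (k + 0.5) / n.

    Averaging f over these nodes approximates E f(X) for X ~ Par(alpha, c).
    """
    return [c * ((k + 0.5) / n) ** (-1.0 / alpha) for k in range(n)]

def laplace(nodes, t):
    """Approximate Laplace transform E exp(-t X) from quantile nodes."""
    return math.fsum(math.exp(-t * x) for x in nodes) / len(nodes)

def smallest_fixed_point(G, tol=1e-12, max_iter=100_000):
    """Iterate t -> G(t) from t = 0 until successive values differ by < tol."""
    t = 0.0
    for _ in range(max_iter):
        t_next = G(t)
        if abs(t_next - t) < tol:
            return t_next
        t = t_next
    return t

def zeta_cm_pareto(lam, alpha):
    """Approximate zeta_CM(MPoi(Par(alpha, c_alpha))) for alpha > 1."""
    c = lam * (1.0 - 1.0 / alpha)             # scale giving mean lam
    x = pareto_nodes(alpha, c)                # mixing law Par(alpha, c)
    x_star = pareto_nodes(alpha - 1.0, c)     # size-biased law Par(alpha - 1, c)
    eta = smallest_fixed_point(lambda s: laplace(x_star, 1.0 - s))
    return 1.0 - laplace(x, 1.0 - eta)        # assumed form of (2.6)

def zeta_poisson(lam):
    """zeta(Poi(lam)) = 1 - eta(Poi(lam))."""
    return 1.0 - smallest_fixed_point(lambda s: math.exp(-lam * (1.0 - s)))

if __name__ == "__main__":
    lam = 4.0
    print(zeta_cm_pareto(lam, 5.0), zeta_cm_pareto(lam, 20.0), zeta_poisson(lam))
```

For $\lambda = 4$ and shapes $5 \le 20$ the three printed values should be nondecreasing from left to right, up to quadrature error.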

Lemma 4.8. Let $p = \mathrm{MPoi}(\mu)$ and $q = \mathrm{MPoi}(\nu)$ where $\mu \le_{\mathrm{icv}} \nu$. Assume
that the supports of $\mu$ and $\nu$ are contained in an interval $[c, \infty)$ for some
$c > 2$. Then $G_{p^\circ}(s) \ge G_{q^\circ}(s)$ for all $s \in [0, 1 - 2/c]$.

Proof. Note first that $G_{\mathrm{MPoi}(\mu)}(s) = L_\mu(1 - s)$ and recall from Table 1
that $\mathrm{MPoi}(\mu)^\circ = \mathrm{MPoi}(\mu^*)$. Hence $G_{p^\circ}(s) = L_{\mu^*}(1 - s)$. Fix
$s \in [0, 1 - 2/c]$ and note that $G_{p^\circ}(s) = m_1(\mu)^{-1} \int \phi_s(x)\, \mu(dx)$, where
$\phi_s(x) = x e^{-(1-s)x}$.

Because $\phi_s'(x) = (1 - (1 - s)x) e^{-(1-s)x}$ and
$\phi_s''(x) = (1 - s)((1 - s)x - 2) e^{-(1-s)x}$, it follows that the function $\phi_s$ is
decreasing on $[\tfrac{1}{1-s}, \infty)$ and convex on $[\tfrac{2}{1-s}, \infty)$. Because
$s \in [0, 1 - 2/c]$, it follows that $\phi_s$ is decreasing and convex on the supports
of $\mu$ and $\nu$. Therefore $\mu \le_{\mathrm{icv}} \nu$ implies $\int \phi_s\, d\mu \ge \int \phi_s\, d\nu$.
Because $\mu \le_{\mathrm{icv}} \nu$ also implies that the first moments are ordered
according to $m_1(\mu) \le m_1(\nu)$, we conclude that

\[ G_{p^\circ}(s) = m_1(\mu)^{-1} \int \phi_s\, d\mu \ge m_1(\nu)^{-1} \int \phi_s\, d\nu = G_{q^\circ}(s). \]

□
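
The sign conditions on $\phi_s$ used in this proof are easy to sanity-check on a grid. The sketch below is only an illustration under assumed names (`phi`, `dphi`, `d2phi`, `decreasing_and_convex_on`) and an arbitrary finite grid $[c, 50]$; it is not a substitute for the argument above.

```python
import math

def phi(x, s):
    """phi_s(x) = x * exp(-(1 - s) * x)."""
    return x * math.exp(-(1.0 - s) * x)

def dphi(x, s):
    """First derivative of phi_s."""
    return (1.0 - (1.0 - s) * x) * math.exp(-(1.0 - s) * x)

def d2phi(x, s):
    """Second derivative of phi_s."""
    return (1.0 - s) * ((1.0 - s) * x - 2.0) * math.exp(-(1.0 - s) * x)

def decreasing_and_convex_on(c, s, x_max=50.0, n=2000):
    """True if phi_s' <= 0 and phi_s'' >= 0 at every grid point of [c, x_max]."""
    xs = [c + (x_max - c) * k / n for k in range(n + 1)]
    return all(dphi(x, s) <= 0.0 and d2phi(x, s) >= 0.0 for x in xs)

if __name__ == "__main__":
    c = 4.0
    print(decreasing_and_convex_on(c, 1.0 - 2.0 / c))  # boundary value s = 1 - 2/c
    print(decreasing_and_convex_on(c, 0.6))            # s beyond the boundary
```

With $c = 4$, the boundary value $s = 1 - 2/c = 0.5$ passes the check, while $s = 0.6$ fails because $\phi_s$ is then concave near $x = c$.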


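Lemma 4.9. If $c_\alpha \to \lambda > 0$ as $\alpha \to \infty$, then $\mathrm{Par}(\alpha, c_\alpha) \to \delta_\lambda$.

Proof. Let $U$ be a uniformly distributed random number in $(0,1)$. Then
$X_\alpha = c_\alpha (1 - U)^{-1/\alpha}$ has $\mathrm{Par}(\alpha, c_\alpha)$ distribution for all $\alpha$. Because
$c_\alpha \to \lambda$ and $(1 - U)^{-1/\alpha} \to 1$, it follows that $X_\alpha \to \lambda$ almost surely,
and hence also in distribution. □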
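
The coupling used in this proof can be simulated directly. The sketch below reuses one set of uniform samples for every shape parameter, as in the proof, with the scale $c_\alpha = \lambda(1 - 1/\alpha)$ from Section 4.3; the seed, the sample size, and the value $\lambda = 4$ are arbitrary illustrative choices.

```python
import random

def pareto_sample(alpha, c, u):
    """Inverse-transform sample c * (1 - u)**(-1/alpha) of Par(alpha, c)."""
    return c * (1.0 - u) ** (-1.0 / alpha)

def mean_abs_deviation(lam, alpha, us):
    """Average |X_alpha - lam| over the uniforms us, with c_alpha = lam * (1 - 1/alpha)."""
    c = lam * (1.0 - 1.0 / alpha)
    return sum(abs(pareto_sample(alpha, c, u) - lam) for u in us) / len(us)

rng = random.Random(0)                       # fixed seed for reproducibility
us = [rng.random() for _ in range(10_000)]   # the same U values for every alpha
devs = {alpha: mean_abs_deviation(4.0, alpha, us) for alpha in (10, 100, 1000)}

if __name__ == "__main__":
    for alpha, dev in devs.items():
        print(alpha, dev)
```

The average deviation from $\lambda$ should shrink as $\alpha$ grows, in line with the almost sure convergence $X_\alpha \to \lambda$.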
4.4 Numerical experiments

After a detailed analysis of the configuration model branching functional
for Pareto-mixed Poisson degree distributions, a natural question to ask is
whether similar observations remain valid for other types of distributions
as well. We studied this question by performing numerical experiments
on two classes of distributions: lognormally mixed Poisson distributions and
binomial distributions.

In Figure 3 we have plotted numerically evaluated values of $\zeta_{\mathrm{CM}}(p)$ where
$p = \mathrm{MPoi}(\mathrm{LNor}(b, \sigma^2))$ is a lognormally mixed Poisson distribution with
scale $\sigma^2 > 0$ and location $b = \log \lambda - \sigma^2/2$ chosen so that the mean of $p$
equals $\lambda > 0$ (Table 1). The curves are plotted as functions of $1/\sigma^2$ so
that variability decreases along the horizontal axis. The behavior of the
branching functional is qualitatively the same as for the Pareto-mixed
case: for network models with a small mean degree, higher variability may
dramatically increase the size of the largest component.



Figure 3: Configuration model branching functional $\zeta_{\mathrm{CM}}(p)$ for a
collection of lognormally mixed Poisson distributions with mean $\lambda$, plotted as a
function of $1/\sigma^2 > 0$.
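
A rough way to recompute single points of such a curve is sketched below. It assumes, as in Section 4.3, that $\zeta_{\mathrm{CM}}(p) = 1 - G_p(\eta(p^\circ))$ with $G_{\mathrm{MPoi}(\mu)}(s) = L_\mu(1-s)$ and $\mathrm{MPoi}(\mu)^\circ = \mathrm{MPoi}(\mu^*)$, and it uses the standard fact that size biasing $\mathrm{LNor}(b, \sigma^2)$ gives $\mathrm{LNor}(b + \sigma^2, \sigma^2)$. The quantile-midpoint quadrature, node count, and function names are illustrative assumptions and need not match how the figure was produced.

```python
import math
from statistics import NormalDist

def lognormal_nodes(b, sigma2, n=4000):
    """Quantile nodes exp(b + sigma * z_k), z_k the normal quantile at (k + 0.5) / n."""
    sigma = math.sqrt(sigma2)
    std_normal = NormalDist()
    return [math.exp(b + sigma * std_normal.inv_cdf((k + 0.5) / n)) for k in range(n)]

def laplace(nodes, t):
    """Approximate Laplace transform E exp(-t X) from quantile nodes."""
    return math.fsum(math.exp(-t * x) for x in nodes) / len(nodes)

def smallest_fixed_point(G, tol=1e-12, max_iter=100_000):
    """Iterate t -> G(t) from t = 0 until successive values differ by < tol."""
    t = 0.0
    for _ in range(max_iter):
        t_next = G(t)
        if abs(t_next - t) < tol:
            return t_next
        t = t_next
    return t

def zeta_cm_lognormal(lam, sigma2):
    """Approximate zeta_CM(MPoi(LNor(b, sigma2))) with mean lam."""
    b = math.log(lam) - sigma2 / 2.0              # location giving mean lam
    x = lognormal_nodes(b, sigma2)                # mixing law
    x_star = lognormal_nodes(b + sigma2, sigma2)  # size-biased mixing law
    eta = smallest_fixed_point(lambda s: laplace(x_star, 1.0 - s))
    return 1.0 - laplace(x, 1.0 - eta)

if __name__ == "__main__":
    for sigma2 in (2.0, 0.01):
        print(sigma2, zeta_cm_lognormal(0.5, sigma2))
```

With mean degree $\lambda = 0.5$, the high-variability case $\sigma^2 = 2$ should give a clearly positive value, while $\sigma^2 = 0.01$ should give a value that is zero up to numerical tolerance.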

In Figure 4 we have plotted numerically evaluated values of $\zeta_{\mathrm{CM}}(p)$ where
$p = \mathrm{Bin}(n, \lambda/n)$ is a binomial distribution with mean $\lambda$, parameterized by
$n > 3$. The variance of $p$ equals $\lambda(1 - \lambda/n)$ and increases towards $\lambda$ along
the horizontal axis. Also in this case with a light-tailed degree distribution,
the overall qualitative picture is the same as for the Pareto and lognormally
mixed Poisson distributions. The one difference is that all curves in Figure 4
appear to be monotone, either increasing (for small mean degree) or
decreasing (for large mean degree). In addition, in this case the values of
$\lambda < 1$ always produce $\zeta_{\mathrm{CM}}(p) = 0$, because the downshifted size biasing of
$\mathrm{Bin}(n, \lambda/n)$ equals $\mathrm{Bin}(n - 1, \lambda/n)$ and has mean $\lambda(1 - 1/n) < 1$ whenever
$\lambda < 1$.
Figure 4: Configuration model branching functional $p \mapsto \zeta_{\mathrm{CM}}(p)$ for a
collection of binomial degree distributions with mean $\lambda$, plotted as a function
of $n > 3$.
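
In the binomial case everything is available in closed form, since $G_{\mathrm{Bin}(n,q)}(s) = (1 - q + qs)^n$ and the downshifted size biasing is $\mathrm{Bin}(n-1, q)$. The short sketch below again assumes $\zeta_{\mathrm{CM}}(p) = 1 - G_p(\eta(p^\circ))$; the function name, tolerance, and the sample values of $\lambda$ and $n$ are illustrative.

```python
def zeta_cm_binomial(lam, n, tol=1e-13, max_iter=1_000_000):
    """zeta_CM(Bin(n, lam / n)) via fixed-point iteration on the pgf of Bin(n - 1, lam / n)."""
    q = lam / n
    t = 0.0
    for _ in range(max_iter):
        t_next = (1.0 - q + q * t) ** (n - 1)   # pgf of Bin(n - 1, q) at t
        done = abs(t_next - t) < tol
        t = t_next
        if done:
            break
    return 1.0 - (1.0 - q + q * t) ** n         # 1 - pgf of Bin(n, q) at eta

if __name__ == "__main__":
    for lam in (0.9, 1.2, 3.0):
        print(lam, [round(zeta_cm_binomial(lam, n), 4) for n in (4, 6, 50)])
```

For $\lambda = 0.9$ the values should vanish up to tolerance, for $\lambda = 1.2$ they should grow with $n$, and for $\lambda = 3$ they should shrink with $n$, matching the qualitative description above.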


5 Conclusions 

In this paper we studied the effect of degree variability on the global
connectivity properties of large network models. The analysis was restricted to the
configuration model and the associated uniform random graph with a given
limiting degree distribution. Counterexamples, both in a bounded-support
case and in a power-law case, show that due to size biasing effects,
increased degree variability may sometimes have a favorable effect
on the size of the giant component, in sharp contrast to standard branching
processes. We also proved rigorously that for certain natural classes of
sufficiently supercritical network models, increased degree variability
has a negative effect on the global connectivity.
Numerical experiments illustrate that these observations can be detected for
both light-tailed and heavy-tailed degree distributions. Because most real-world
social networks have mean degree much higher than one, we do not
expect to encounter anomalous variability effects in their global connectivity
structure. However, such effects might be important to take into account
when studying long-range effects on epidemic generated graphs spanned by
links over which a rare message or pathogen is transmitted.
A A continuity property of branching processes

Proof of Lemma 2.6. We denote $\eta = \eta(p)$ and $\eta_n = \eta(p_n)$. We also denote
$G(t) = G_p(t)$ and $G_n(t) = G_{p_n}(t)$. Observe first that

\[ |G_n(t) - G(t)| \le \sum_{k=0}^\infty t^k |p_n(k) - p(k)| \le 2 d_{\mathrm{tv}}(p_n, p) \]

for all $t \in [0,1]$, where $d_{\mathrm{tv}}$ refers to the total variation distance. Because
convergence in total variation and convergence in distribution are equivalent
on countable spaces, it follows that

\[ \|G_n - G\| = \sup_{t \in [0,1]} |G_n(t) - G(t)| \to 0. \tag{A.1} \]

(i) Consider first the case where $\eta(p) \in (0,1)$. Then $p(0) + p(1) < 1$.
Hence $p(k) > 0$ for some $k \ge 2$, and this shows that

\[ G''(t) = \sum_{k=2}^\infty p(k)\, k(k-1)\, t^{k-2} > 0 \]

for all $t \in (0,1)$. Note that by the continuity of $G(t) - t$, we see that

\[ G(t) > t \quad \text{for all } t \in [0, \eta). \tag{A.2} \]

We will next show that

\[ G(t) < t \quad \text{for all } t \in (\eta, 1). \tag{A.3} \]

Assume, on the contrary, that $G(\eta') \ge \eta'$ for some $\eta' \in (\eta, 1)$. Then indeed
$G(\eta') = \eta'$, because by the convexity of $G$, the point $(\eta', G(\eta'))$ must lie
below the line connecting the points $(\eta, G(\eta)) = (\eta, \eta)$ and $(1, G(1)) = (1,1)$.
But then the points $(\eta, G(\eta))$, $(\eta', G(\eta'))$, $(1, G(1))$ all lie on the straight line
between $(\eta, \eta)$ and $(1,1)$, and the convexity of $G$ implies that $G(t) = t$ on $[\eta, 1]$.
This contradicts the fact that $G''(t) > 0$ on $(\eta, 1)$. Hence we conclude that
(A.3) is valid.

Next fix any $\epsilon > 0$ such that $(\eta - \epsilon, \eta + \epsilon) \subset (0,1)$. Then (A.2) and the
continuity of $G(t) - t$ imply that $\delta_1 := \inf_{t \in [0, \eta - \epsilon]} (G(t) - t) > 0$. On the
other hand (A.3) implies that $\delta_2 := (\eta + \epsilon) - G(\eta + \epsilon) > 0$. By (A.1) we
may fix $n_0$ such that $\|G_n - G\| \le \min(\delta_1, \delta_2)/2$ for all $n \ge n_0$. Then for
any $n \ge n_0$, we find that $G_n(t) - t \ge \delta_1/2$ on $[0, \eta - \epsilon]$ and
$G_n(\eta + \epsilon) \le G(\eta + \epsilon) + \|G_n - G\| \le (\eta + \epsilon) - \delta_2/2$. Hence
$G_n(t) > t$ on $[0, \eta - \epsilon]$ and $G_n(t) < t$ for $t = \eta + \epsilon$. By the continuity of
$G_n$, we conclude that $G_n$ has no fixed points in $[0, \eta - \epsilon]$ and at least one
fixed point in $(\eta - \epsilon, \eta + \epsilon)$. Therefore the smallest nonzero fixed point of
$G_n$ satisfies $\eta_n \in (\eta - \epsilon, \eta + \epsilon)$ for all $n \ge n_0$.

(ii) Consider next the case with $\eta = 1$. Then $G(0) > 0$ and the continuity
of $G$ imply that $G(t) > t$ on the interval $(0,1)$. In particular,
$\delta := \inf_{t \in [0, 1 - \epsilon]} (G(t) - t) > 0$ for any $\epsilon > 0$. For any large enough $n$
such that $\|G_n - G\| \le \delta/2$, it follows that
$G_n(t) \ge G(t) - \|G_n - G\| \ge t + \delta/2$
for all $t \in [0, 1 - \epsilon]$. This shows that $G_n$ has no fixed points in $[0, 1 - \epsilon]$, and
hence $\eta_n > 1 - \epsilon$ for all large enough $n$. □
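
The first estimate of this proof, $\|G_n - G\| \le 2 d_{\mathrm{tv}}(p_n, p)$, can be illustrated with the classical approximation of $\mathrm{Poi}(\lambda)$ by $p_n = \mathrm{Bin}(n, \lambda/n)$. The sketch below is only an illustration: the truncation level `KMAX`, the grid used for the supremum, and all function names are assumptions made here.

```python
import math

def binomial_pmf(n, q, kmax):
    """Probabilities p(0), ..., p(kmax) of Bin(n, q), assuming 0 < q < 1."""
    p = [(1.0 - q) ** n]
    for k in range(kmax):
        # ratio p(k+1) / p(k) = (n - k) / (k + 1) * q / (1 - q), zero beyond k = n
        p.append(p[-1] * max(n - k, 0) / (k + 1) * q / (1.0 - q))
    return p

def poisson_pmf(lam, kmax):
    """Probabilities p(0), ..., p(kmax) of Poi(lam)."""
    p = [math.exp(-lam)]
    for k in range(kmax):
        p.append(p[-1] * lam / (k + 1))
    return p

def tv_distance(p, q):
    """0.5 * sum |p(k) - q(k)| over a common truncated range."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

def sup_gap(n, lam, grid=200):
    """Grid approximation of sup over t in [0, 1] of |G_n(t) - G(t)|."""
    q = lam / n
    return max(abs((1.0 - q + q * t) ** n - math.exp(-lam * (1.0 - t)))
               for t in (j / grid for j in range(grid + 1)))

LAM, KMAX = 3.0, 60
poisson = poisson_pmf(LAM, KMAX)
results = {n: (sup_gap(n, LAM), tv_distance(binomial_pmf(n, LAM / n, KMAX), poisson))
           for n in (10, 100, 1000)}

if __name__ == "__main__":
    for n, (gap, tv) in results.items():
        print(n, gap, 2.0 * tv)
```

Each printed gap should stay below twice the total variation distance, and both columns should decrease as $n$ grows.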

B Stochastic ordering of Pareto distributions 

The following result characterizes stochastic ordering properties of Pareto
distributions. For $i = 1,2$, let $\mu_i = \mathrm{Par}(\alpha_i, c_i)$ be the Pareto distribution
with shape $\alpha_i > 1$, scale $c_i > 0$, and mean $\lambda_i = c_i(1 - 1/\alpha_i)^{-1}$.

Theorem B.1. For any Pareto distributions $\mu_i = \mathrm{Par}(\alpha_i, c_i)$ with shape
$\alpha_i > 1$, scale $c_i > 0$, and mean $\lambda_i = c_i(1 - 1/\alpha_i)^{-1}$:

(i) $\mu_1 \le_{\mathrm{icx}} \mu_2$ if and only if $\lambda_1 \le \lambda_2$ and $\alpha_1 \ge \alpha_2$.

(ii) $\mu_1 \le_{\mathrm{cx}} \mu_2$ if and only if $\lambda_1 = \lambda_2$ and $\alpha_1 \ge \alpha_2$.

(iii) $\mu_1 \le_{\mathrm{icv}} \mu_2$ if and only if $\lambda_1 \le \lambda_2$ and $c_1 \le c_2$.

Result (i) above quantifies the intuitively natural property that a larger
mean and a heavier tail make a Pareto distribution bigger and more variable
in the increasing convex order. Interestingly, result (iii) may be valid for both
$\alpha_1 < \alpha_2$ and $\alpha_1 > \alpha_2$, depending on the value of the scale parameter.

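Proof. (i) Let $F_i^{-1}(s) = c_i (1 - s)^{-1/\alpha_i}$ be the quantile function of $\mu_i$, and
denote the upper and lower integrated quantile functions of $\mu_i$ by

\[ \bar{H}_i(t) = \int_t^1 F_i^{-1}(s)\, ds = \lambda_i (1 - t)^{1 - 1/\alpha_i}, \]
\[ H_i(t) = \int_0^t F_i^{-1}(s)\, ds = \lambda_i \bigl(1 - (1 - t)^{1 - 1/\alpha_i}\bigr). \]

Now by [SS07, Thm 3.A.5, Thm 4.A.3] $\mu_1 \le_{\mathrm{icx}} \mu_2$ if and only if
$\bar{H}_1(t) \le \bar{H}_2(t)$ for all $t \in (0,1)$, that is,

\[ \lambda_1 (1 - t)^{1 - 1/\alpha_1} \le \lambda_2 (1 - t)^{1 - 1/\alpha_2} \tag{B.1} \]

for all $t \in (0,1)$. If $\lambda_1 \le \lambda_2$ and $\alpha_1 \ge \alpha_2$, then the above inequality holds for
all $t \in (0,1)$, and hence $\mu_1 \le_{\mathrm{icx}} \mu_2$.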

Assume next that $\mu_1 \le_{\mathrm{icx}} \mu_2$. Then (B.1) is valid for all $t \in (0,1)$. By
letting $t \to 0$, it follows that $\lambda_1 \le \lambda_2$. Inequality (B.1) also implies that the
fraction

\[ \frac{(1 - t)^{1 - 1/\alpha_1}}{(1 - t)^{1 - 1/\alpha_2}} = (1 - t)^{1/\alpha_2 - 1/\alpha_1} \]

is bounded over $t \in (0,1)$. This is possible only if $1/\alpha_2 - 1/\alpha_1 \ge 0$, so we
obtain $\alpha_1 \ge \alpha_2$.

(ii) It is sufficient to note that $\mu_1 \le_{\mathrm{cx}} \mu_2$ if and only if $\mu_1 \le_{\mathrm{icx}} \mu_2$ and
$m_1(\mu_1) = m_1(\mu_2)$. The claim hence follows from (i).

(iii) Recall [FS04, Thm 2.58] that $\mu_1 \le_{\mathrm{icv}} \mu_2$ if and only if $H_1(t) \le H_2(t)$
for all $t \in [0, 1]$.

Assume now that $\mu_1 \le_{\mathrm{icv}} \mu_2$. Then $H_1(1) \le H_2(1)$ implies $\lambda_1 \le \lambda_2$.
Moreover, $H_1(0) = H_2(0) = 0$ implies that
$t^{-1}(H_1(t) - H_1(0)) \le t^{-1}(H_2(t) - H_2(0))$ for all $t \in (0,1)$, and by letting
$t \to 0$, we get $c_1 = H_1'(0) \le H_2'(0) = c_2$, so that $c_1 \le c_2$. (Note that this
reasoning indeed showed that $\mu_1 \le_{\mathrm{st}} \mu_2$, which automatically implies
$\mu_1 \le_{\mathrm{icv}} \mu_2$.)

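To prove the other direction of the claim, let us next assume that $\lambda_1 \le \lambda_2$
and $c_1 \le c_2$. Let us analyze separately the cases $\alpha_1 \ge \alpha_2$ and $\alpha_1 < \alpha_2$.

(a) If $\alpha_1 \ge \alpha_2$, then

\[ F_1^{-1}(s) = c_1 (1 - s)^{-1/\alpha_1} \le c_2 (1 - s)^{-1/\alpha_2} = F_2^{-1}(s) \]

for all $s \in (0,1)$. Integrating over $s \in (0, t)$ gives $H_1(t) \le H_2(t)$ for all
$t \in [0,1]$.

(b) If $\alpha_1 < \alpha_2$, then $\lambda_1 \le \lambda_2$ shows that

\[ \frac{H_2(t)}{H_1(t)} = \frac{\lambda_2}{\lambda_1} \cdot \frac{1 - (1 - t)^{a_2}}{1 - (1 - t)^{a_1}} \ge \frac{1 - (1 - t)^{a_2}}{1 - (1 - t)^{a_1}}, \]

where $a_i = 1 - 1/\alpha_i$. Now $\alpha_1 < \alpha_2$ implies $a_1 < a_2$. Hence also
$(1 - t)^{a_1} \ge (1 - t)^{a_2}$, which shows that $H_2(t) \ge H_1(t)$ for all $t \in (0,1)$.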

□ 
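
The criteria of Theorem B.1 can be checked against the closed-form integrated quantile functions on a grid. In the sketch below a Pareto law is represented by a pair `(alpha, c)`; the names `H_lower`, `H_upper`, `dominated`, and the four example parameter pairs are illustrative assumptions, and a grid comparison is of course only a numerical plausibility check.

```python
def pareto_mean(alpha, c):
    """Mean c / (1 - 1/alpha) of Par(alpha, c), alpha > 1."""
    return c / (1.0 - 1.0 / alpha)

def H_lower(t, alpha, c):
    """Lower integrated quantile function: integral of F^{-1} over (0, t)."""
    return pareto_mean(alpha, c) * (1.0 - (1.0 - t) ** (1.0 - 1.0 / alpha))

def H_upper(t, alpha, c):
    """Upper integrated quantile function: integral of F^{-1} over (t, 1)."""
    return pareto_mean(alpha, c) * (1.0 - t) ** (1.0 - 1.0 / alpha)

def dominated(H, par1, par2, n=1000, tol=1e-12):
    """True if H(t; par1) <= H(t; par2) at all grid points t = k / n in [0, 1]."""
    return all(H(k / n, *par1) <= H(k / n, *par2) + tol for k in range(n + 1))

# (alpha, c) pairs: mu1 vs mu2 has alpha1 < alpha2, c1 <= c2, means 3.0 <= 3.5
mu1, mu2 = (3.0, 2.0), (5.0, 2.8)
# nu1 vs nu2 has alpha1 > alpha2, c1 > c2, means 3.0 <= 3.2
nu1, nu2 = (10.0, 2.7), (2.0, 1.6)

if __name__ == "__main__":
    print("mu1 <=icv mu2:", dominated(H_lower, mu1, mu2))
    print("mu1 <=icx mu2:", dominated(H_upper, mu1, mu2))
    print("nu1 <=icv nu2:", dominated(H_lower, nu1, nu2))
    print("nu1 <=icx nu2:", dominated(H_upper, nu1, nu2))
```

The first pair should be ordered in the increasing concave order but not in the increasing convex order, and the second pair the other way around, in agreement with parts (i) and (iii).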